-
-
-
-
- Turbo Pascal Record Compress Procedure
-
- Carl A Franz
- JFL Consulting
- "We will sell no software
- before it's written"
- 1115 S. Ridgeland
- Oak Park, Il. 60304
- (708) 383-1546
- CServe: 71041,1512
-
-
-
- When unzipping the COUN.ZIP file, you should have received:
- 1) COUN.PAS - The Compress/Uncompress source.
- 2) TESTCOUN.PAS - Demonstration program for COUN.
- 3) COUN.DOC - This file.
-
- Quite frankly, this is only useful if you have a database like
- BTrieve or TBTree which allows you to have variable length records in a
- database.
-
- If you can't afford BTrieve (I can't), try TBTree written by a guy
- named Dean Farwell 73240,3335. I get nothing from Dean to plug his
- product, so my opinion of this product is untainted. It's great. You
- can put up a well designed database with Turbo Pascal and the TBTree
- product. It's much better than the Borland Database toolkit. I think
- it's about $25 now. A heck of a bang for your buck.
-
-
- Anyway, on to this product. The routines in COUN compress out
- space in your records by removing the extra space in the STRING
- variables. For instance, if you have a record for an address book like
- the following:
-
- AddrBook = Record
-   NAME  : String[40];
-   ADDR1 : String[40];
-   ADDR2 : String[40];
-   City  : String[25];
-   St    : String[2];
-   Zip   : String[9];
- End;
-
- You have allocated 162 bytes; however, rarely is all that space used
- for actual data. For instance, my name and address use all of 64
- bytes. That's a lot of wasted space. Since I am, in fact, building an
- address book of sorts, and I am planning on keeping several record
- types in one file, I figured I needed to save some space. Thus I wrote
- these Compress/Uncompress routines for Turbo Pascal Records.
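
To make the waste concrete, here is a small sketch that compares the declared size of the record with the bytes actually in use. The sample data is hypothetical, and the "in use" figure assumes each string keeps its length byte plus only the characters stored; the exact compressed layout is whatever COUN.PAS writes.

```pascal
program AddrSize;
{ Compares the declared size of the address record with the bytes in use. }
type
  AddrBook = record
    Name  : String[40];
    Addr1 : String[40];
    Addr2 : String[40];
    City  : String[25];
    St    : String[2];
    Zip   : String[9];
  end;
var
  A    : AddrBook;
  Used : Integer;
begin
  { Hypothetical sample data, for illustration only. }
  A.Name  := 'Pat Example';
  A.Addr1 := '100 Main St.';
  A.Addr2 := '';
  A.City  := 'Springfield';
  A.St    := 'IL';
  A.Zip   := '60000';
  { Assumption: each string costs its length byte plus the characters stored. }
  Used := (Length(A.Name) + 1) + (Length(A.Addr1) + 1) + (Length(A.Addr2) + 1)
        + (Length(A.City) + 1) + (Length(A.St) + 1) + (Length(A.Zip) + 1);
  WriteLn('Declared: ', SizeOf(AddrBook), ' bytes');  { 162 under Turbo Pascal }
  WriteLn('In use  : ', Used, ' bytes');
end.
```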
-
-
-
-
- How it works:
-
-
- There are 2 routines.
-
- FUNCTION Compress(CMap : STRING; VAR InData; VAR OutData) : INTEGER;
-
- This function accepts a map of your Pascal record (CMap), your
- record (InData), and someplace you want the compressed record
- information to go (OutData). I highly suggest that the field you use
- for OutData be a byte array as large as the Record you are
- compressing. The function then returns the length of the compressed
- record.
-
- PROCEDURE UnCompress(CMap : STRING; VAR InData; VAR OutData);
-
- This procedure accepts a map of the record (CMap), your compressed
- byte array (Indata), and your record (OutData). I had been
- considering swapping positions of InData and OutData so that the
- calling conventions are the same for COMPRESS and UNCOMPRESS but
- didn't. If you want to, go ahead, you've got the source code.
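
A round trip through the two routines might look like the sketch below. It assumes the unit is named COUN (after COUN.PAS) and exports Compress and UnCompress with exactly the signatures quoted above, so it compiles only with COUN.PAS present. The sample data is hypothetical; TESTCOUN.PAS is the author's real demonstration.

```pascal
program CounUsage;
{ Sketch only: needs COUN.PAS from the archive to compile. }
uses COUN;   { assumed unit name, taken from the file name }

const
  AddrMap = 'S40S40S40S25S2S9';   { map for AddrBook, taken from the text }

type
  AddrBook = record
    Name  : String[40];
    Addr1 : String[40];
    Addr2 : String[40];
    City  : String[25];
    St    : String[2];
    Zip   : String[9];
  end;
  { Byte array as large as the record, as the text recommends. }
  AddrBuf = array[1..SizeOf(AddrBook)] of Byte;

var
  Rec, Back : AddrBook;
  Buf       : AddrBuf;
  Len       : Integer;

begin
  FillChar(Rec, SizeOf(Rec), 0);
  Rec.Name  := 'Pat Example';     { hypothetical sample data }
  Rec.Addr1 := '100 Main St.';
  Rec.City  := 'Springfield';
  Rec.St    := 'IL';
  Rec.Zip   := '60000';

  { Source first, destination second: record in, bytes out. }
  Len := Compress(AddrMap, Rec, Buf);
  WriteLn('Compressed ', SizeOf(Rec), ' bytes down to ', Len);

  { Same argument order, but now bytes in, record out. }
  UnCompress(AddrMap, Buf, Back);
  WriteLn('Name after round trip: ', Back.Name);
end.
```

Only the first Len bytes of Buf need to be written to the variable length record file.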
-
-
- CMap, the record map, is the most complicated part of this mess. To
- compress and uncompress your record, I need to know what it looks like.
- Doing this is fairly simple. I use the word 'fairly' advisedly.
-
- Referring to the Address Book record, the CMap would be
- 'S40S40S40S25S2S9'. You should get an idea from that. Basically, you
- tell me, in shorthand, what the fields in the record are. To wit:
-
- I = INTEGER    2 bytes   (case is irrelevant)
- L = LONGINT    4 bytes
- R = REAL       6 bytes
- B = BYTE       1 byte
- S = STRING     length given after the 'S' (see below)
- C = CHAR       1 byte
- P = POINTER    4 bytes
- W = WORD       2 bytes
-
- Types not supported are: enumerated types; the single, double, and
- comp floating-point types; and set types.
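
The fixed sizes in the table can be captured in a small lookup, sketched here. CodeSize is a hypothetical helper name, not something from COUN.PAS. The sizes are written as literals from the table rather than taken with SizeOf, so the numbers stay the Turbo Pascal ones on any compiler.

```pascal
program MapSizes;
{ Returns the byte size the table lists for each one-letter map code. }

function CodeSize(Code : Char) : Integer;
begin
  case UpCase(Code) of            { case is irrelevant, as the table says }
    'B', 'C' : CodeSize := 1;
    'I', 'W' : CodeSize := 2;
    'L', 'P' : CodeSize := 4;
    'R'      : CodeSize := 6;
  else
    CodeSize := 0;   { 'S' is variable; anything else has no fixed size here }
  end;
end;

begin
  { Two Integers and a pointer: 2 + 2 + 4 }
  WriteLn('IIP = ', CodeSize('I') + CodeSize('i') + CodeSize('P'), ' bytes');
end.
```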
-
- 'S' may have a length behind it to define the declared length of
- the string, i.e., STRING[40] is 'S40'. If no length follows the
- string identifier 'S', I assume the length is 255 bytes, the length
- of a string declared simply as STRING.
-
- A number may be used to define a length of data. If you have 5
- byte fields in a row, you can define them either as 'BBBBB' or as
- '5'. Likewise, if a record contains 2 Integers and a pointer, you may
- define them as 'IIP' or '8'. If you have a STRING[40] followed by 5
- byte fields, you must separate them with a comma (','), i.e., 's40,5'.
- Let's face it, 'S405' makes no sense. Also, an 'S' followed by a number
- that is not the string's length must be separated from it by a comma.
-
-
-
-
-
- For example, if you have a field defined as STRING followed by 5 BYTE
- fields, 'S5' would be taken to be a 5 byte string, whereas 'S,5' is a
- 255 byte string followed by 5 bytes of whatever.
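
The rules for 'S', bare numbers, and commas can be checked with a sketch that adds up the uncompressed bytes a bracket-free CMap describes. MapLen, ReadNum, and CodeSize are hypothetical names, not the routines in COUN.PAS, and each string is counted with its length byte so the total matches SizeOf of the record. Unknown characters are silently counted as 0 here, whereas the real unit reports an error.

```pascal
program FlatMapLen;
{ Sums the uncompressed bytes described by a CMap without brackets. }

function CodeSize(Code : Char) : Integer;
begin
  case Code of
    'B', 'C' : CodeSize := 1;
    'I', 'W' : CodeSize := 2;
    'L', 'P' : CodeSize := 4;
    'R'      : CodeSize := 6;
  else
    CodeSize := 0;                 { commas and unknown characters add nothing }
  end;
end;

{ Reads the digits starting at Map[P] and leaves P just past them. }
function ReadNum(var Map : String; var P : Integer) : Integer;
var
  N : Integer;
begin
  N := 0;
  while (P <= Length(Map)) and (Map[P] >= '0') and (Map[P] <= '9') do
  begin
    N := N * 10 + Ord(Map[P]) - Ord('0');
    Inc(P);
  end;
  ReadNum := N;
end;

function MapLen(Map : String) : Integer;
var
  P, Total : Integer;
  Ch       : Char;
begin
  P := 1;
  Total := 0;
  while P <= Length(Map) do
  begin
    Ch := UpCase(Map[P]);
    if Ch = 'S' then
    begin
      Inc(P);
      if (P <= Length(Map)) and (Map[P] >= '0') and (Map[P] <= '9') then
        Total := Total + ReadNum(Map, P) + 1   { declared length + length byte }
      else
        Total := Total + 256;                  { plain STRING: 255 + length byte }
    end
    else if (Ch >= '0') and (Ch <= '9') then
      Total := Total + ReadNum(Map, P)         { raw byte count }
    else
    begin
      Total := Total + CodeSize(Ch);
      Inc(P);
    end;
  end;
  MapLen := Total;
end;

begin
  WriteLn(MapLen('S40S40S40S25S2S9'));  { 162, the address book record }
  WriteLn(MapLen('s40,5'));             { 46:  41 + 5 }
  WriteLn(MapLen('S,5'));               { 261: 256 + 5 }
end.
```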
-
-
-
-
- So, you say you've got arrays. I can handle that. Let's say you've
- defined a record thus:
-
- Rec = Record
-   StrArray : array [1..25] of string[40];
- End;
-
- No problem. Arrays are defined with brackets ('[',']'). A left
- bracket '[' followed by the number of items in the array starts an
- array definition and a right bracket ']' ends it. To wit: '[25s40]'
- defines an array of 25 40 byte strings (Array [1..25] of STRING[40]).
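
As a quick sanity check, the declaration and its map can sit side by side in a program, as in this sketch. RecMap is a hypothetical constant name.

```pascal
program ArrayMapDemo;
{ Pairs the array record with the map string that describes it. }
type
  Rec = record
    StrArray : array[1..25] of String[40];
  end;
const
  RecMap = '[25S40]';   { 25 repeats of a String[40] }
begin
  { 25 strings of 41 bytes each (40 characters + length byte) = 1025 }
  WriteLn('Map ', RecMap, ' describes ', SizeOf(Rec), ' bytes');
end.
```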
-
- Arrays can also be nested up to 100 levels deep. Actually, I've
- allowed for 100 levels in my tables but realistically you may have
- only 100 symbols of any kind in the CMap string. If you find a need
- to expand the limits, go ahead. The type definitions L1 and L2 are
- where to change them. These are the Cmap parse tables.
-
-
- There are two fields for flagging errors:
-
- 1) COUNERR, an integer where:
-      1 is a memory allocation error.
-      2 is an invalid mnemonic error
-        (I don't recognize a record map token character);
-        COUNWHR tells you the character position.
-      3 is a bracket mismatch error.
-      4 means the CMap is too big.
-
- 2) COUNWHR, an integer field giving the position in the CMap string
-    that caused the trouble.
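
A caller will probably want readable messages for these codes. The sketch below maps the four documented values to text. CounErrText is a hypothetical helper, and the program stands alone; in real use you would pass it COUNERR after a call, assuming the unit exports that variable.

```pascal
program CounErrDemo;
{ Turns the documented COUNERR codes into readable messages. }

function CounErrText(Code : Integer) : String;
begin
  case Code of
    1 : CounErrText := 'memory allocation error';
    2 : CounErrText := 'invalid mnemonic in CMap';
    3 : CounErrText := 'bracket mismatch in CMap';
    4 : CounErrText := 'CMap is too big';
  else
    CounErrText := 'code not listed in COUN.DOC';
  end;
end;

begin
  { Stand-in value; a real program would pass COUNERR here. }
  WriteLn('Code 3: ', CounErrText(3));
end.
```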
-
- There are several limits as to what I allow. Records can only
- be 32000 bytes long. Also, like I said above, the maximum CMap length
- is 100 characters. Multi-dimensional arrays are not supported. Oh,
- you can do it by defining a nested array, but I wouldn't try to define
- a multi-dimensional array which contains strings unless you really
- understand how Turbo Pascal allocates memory.
-
- A note about the previous paragraph: There are no good reasons for
- most of the limitations. I just didn't need anything bigger. If,
- however, you do decide to make the Byte Array bigger, there are Turbo
- Pascal limitations. Integers go to +32K so indexes need to be changed
- to LongInt or Word. I'm not sure how big arrays can be, but there is
- a limit, look it up. Also, the obvious limit to the CMAP is 255
- characters. If you come up with any interesting ways around that, let
- me know.
-
-
-
-
-
-
-
- When compiled, the COUN.PAS unit uses 2264 bytes of code and 53
- bytes of data space. If there is enough interest in this (or if Dean
- Farwell asks me to), I will convert this to TASM assembler. It should
- then be faster and smaller. If someone else wants to do it, that's fine
- also. Please send me the code when you're done.
-
-
- The source code is provided for several reasons. 1) I like to see
- what other people are doing; I assume others do too. 2) If someone
- comes up with a nifty way of making these routines faster, smaller,
- more elegant, whatever, I would like to know.
-
- If you use these routines, I don't want money. Well, yes I do. If
- you feel like sending me a fiver, go ahead. What I really want is to
- know if anyone finds them useful. Drop me a note. I enjoy chatting
- with others in the field.
-
- If you, God forbid, find any bugs in these routines, please let me
- know. I will fix them and get a new version out to you ASAP. I'm
- very proud of my work, so I really do try hard to provide the best
- time will allow. Also, try fixing them yourself, it's good practice.
-
- I have a 20-month-old child in the house, so no late night calls.
- Anything after 10pm CST and I'll probably get quite angry. You're much
- more likely to get me via CompuServe than by phone. But if
- you must, evenings and weekends are the best time. I do not, under any
- circumstances, accept collect calls. Deal with it.
-
- Biography: (I saw it in someone else's doc and thought it was a good
- idea)
-
- Carl Franz has been in programming for 13 years. He has written
- code professionally for Univac, Burroughs, DEC, IBM Mainframe, Z80
- CP/M, and IBM PC. Currently I'm a Technical Advisor for a commercial
- bank. I consult on the side when the mood hits me. The JFL in JFL
- Consulting stands for 'Just For Laughs' (not really, but you get the
- point). Need a utility written, give me a buzz, if it sounds like fun
- we can work something out.
-
- Yet again I'm going to plug TBTree. The next version will provide
- Network support. It already provides fixed and variable length record
- support, record lists, keys of Turbo Pascal variable types, so-so
- documentation but good example programs. Last I looked, it was in
- BPROGA Lib 2. It's a big download (about 300K) but worth it. All
- source code provided. And, for goodness sake, pay the man his $25, it
- isn't a lot for what you are getting and he needs to know if anyone is
- really using the product. On top of which, as far as I can tell it's
- bugless.
-
-
- Good luck and may the farce be with you.
-
-
-
-
-
- For Algorithm Freaks
-
-
-
-
- The algorithm for the process is kind of brainless. (Brainless
- means 'Why didn't I think of that earlier'). Basically, there are 2
- tables: 1) L1 absorbs all necessary information about the tokens
- in the CMap string, 2) L2 allows me to stack Array-Start information
- to handle nested arrays.
-
- At its most basic there are 6 token types: 1) 'S' or string with
- an optional length; 2) scalar lengths (numeric values); 3) any of the
- rest of the mnemonics which refer to Pascal types; 4) the start-array
- left bracket '[' plus iteration value; 5) the end-array right bracket
- ']'; and 6) the lowly comma.
-
- ParseCMap calls GetToken at the start of each loop. GetToken
- looks at the next value in CMap and loads LP1T with the token type,
- whatever the character is, and a length. The length of every Pascal
- type is obtained via SizeOf, except String (S), for which GetNum is
- called to check if there is a numeric character after the 'S'. If there
- is a numeric character, it absorbs characters from CMap until a
- non-numeric value is found, converting the mess into an integer. LP1T
- is later copied to the next item in the L1 table.
-
- The '[' or Start-Array does something a little different. For the
- most part it works the same as the String 'S' token, except that when
- one is found an entry is pushed onto stack L2. The entry consists of
- the index value of where the '[' entry is in L1. As '['s are found,
- each is pushed onto the stack. When an Array-End ']' token is found,
- an entry in the L2 stack is popped. This entry contains the index
- location of the matching Start-Array. The Size component of the ']'
- entry in L1 is then loaded with the location of the matching
- Start-Array so that when they are finally processed you will know
- which entry of the L1 table to return to for iteration.
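
The push and pop bookkeeping can be sketched on its own. This simplified version works on character positions in the CMap rather than on L1 token indexes, and the names Stack, Match, and PairBrackets are hypothetical. The nested map in the demo is my guess at how nesting is written, extrapolated from the '[25s40]' example. The result code 3 follows the bracket mismatch error described earlier.

```pascal
program BracketMatch;
{ Pairs each ']' in a CMap with its '[' using a small stack. }
var
  Stack : array[1..255] of Integer;  { positions of '[' not yet closed }
  Match : array[1..255] of Integer;  { for a ']' at P, Match[P] is its '[' }

{ Returns 0 when every bracket pairs up, 3 on a mismatch. }
function PairBrackets(Map : String) : Integer;
var
  P, Top : Integer;
begin
  Top := 0;
  PairBrackets := 0;
  for P := 1 to Length(Map) do
  begin
    Match[P] := 0;
    if Map[P] = '[' then
    begin
      Inc(Top);                      { push the position of this '[' }
      Stack[Top] := P;
    end
    else if Map[P] = ']' then
    begin
      if Top = 0 then
      begin
        PairBrackets := 3;           { a ']' with nothing open }
        Exit;
      end;
      Match[P] := Stack[Top];        { pop: this ']' closes the latest '[' }
      Dec(Top);
    end;
  end;
  if Top <> 0 then
    PairBrackets := 3;               { a '[' was never closed }
end;

begin
  if PairBrackets('[3S20[2S12]]') = 0 then
    WriteLn('] at 11 closes [ at ', Match[11],
            ', ] at 12 closes [ at ', Match[12]);   { 6 and 1 }
  WriteLn('Unbalanced map gives code ', PairBrackets('[25S40'));  { 3 }
end.
```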
-
- I have to apologize about the naming conventions. I was rereading
- some notes on expression parsing and evaluation from college which
- used the same stupidly cryptic conventions. I wasn't feeling
- particularly creative at 2:30am so I used them instead of making up
- better ones.
-
- On the Compress/UnCompress side you step through the L1 array and
- do what it says, with one exception. When a Start-Array is found, the
- iteration count is moved from Size to Decr. Then, upon seeing an
- End-Array, the Decr of the matching Start-Array is checked: if it is 0,
- nothing is done and processing continues to the next item; if it is
- not zero, it is decremented and the index of the matching Start-Array
- is loaded into the L1 index. Remember that the L1 index will be
- incremented before checking the next L1 entry, so the Start-Array will
- not actually be processed again during the 'array loop'. Also, each
- Start-Array has its own Decr, thus nested arrays will process
- properly.
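
Here is a minimal sketch of that loop over a hand-built token table. The type and field layout are assumptions modelled on the description, not copied from COUN.PAS. One deliberate difference: this version decrements Decr first and loops back only while it is still above zero, so a count of N gives exactly N passes. The exact ordering in the original is in the source.

```pascal
program ArrayLoop;
{ Walks a tiny hand-built token table the way the text describes. }
type
  TokKind = (tkItem, tkStart, tkEnd);
  Token = record
    Kind : TokKind;
    Size : Integer;   { tkStart: repeat count; tkEnd: index of matching start }
    Decr : Integer;   { tkStart only: passes still to run }
  end;
const
  N = 6;
var
  T         : array[1..N] of Token;
  I, Visits : Integer;

procedure SetTok(Idx : Integer; K : TokKind; S : Integer);
begin
  T[Idx].Kind := K;
  T[Idx].Size := S;
  T[Idx].Decr := 0;
end;

begin
  { Table for a shape like [2 item [3 item ] ] }
  SetTok(1, tkStart, 2);
  SetTok(2, tkItem, 0);
  SetTok(3, tkStart, 3);
  SetTok(4, tkItem, 0);
  SetTok(5, tkEnd, 3);     { closes the start at index 3 }
  SetTok(6, tkEnd, 1);     { closes the start at index 1 }

  Visits := 0;
  I := 1;
  while I <= N do
  begin
    case T[I].Kind of
      tkItem  : Inc(Visits);                 { stand-in for copying a field }
      tkStart : T[I].Decr := T[I].Size;      { arm this array's own counter }
      tkEnd   :
        begin
          Dec(T[T[I].Size].Decr);
          if T[T[I].Size].Decr > 0 then
            I := T[I].Size;   { jump back; the Inc below skips the start itself }
        end;
    end;
    Inc(I);
  end;
  WriteLn('Items processed: ', Visits);      { 2 * (1 + 3) = 8 }
end.
```

Because the jump lands on the Start-Array and the index is then incremented, the outer counter is not reset on each pass, while the inner Start-Array is stepped over again and re-armed every time the outer loop repeats.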
-
- There is a slightly more efficient way of handling the array loops;
- however, it involves another integer in L1 and some somewhat more
- complicated code. Also, I have a blind spot figuring out where I
- should be with indexes. I'm always one ahead or behind where I should
- be.
-
- ----------------end-of-author's-documentation---------------
-
- Software Library Information:
-
- This disk copy provided as a service of
-
- The Public (Software) Library
-
- We are not the authors of this program, nor are we associated
- with the author in any way other than as a distributor of the
- program in accordance with the author's terms of distribution.
-
- Please direct shareware payments and specific questions about
- this program to the author of the program, whose name appears
- elsewhere in this documentation. If you have trouble getting
- in touch with the author, we will do whatever we can to help
- you with your questions. All programs have been tested and do
- run. To report problems, please use the form that is in the
- file PROBLEM.DOC on many of our disks or in other written format
- with screen printouts, if possible. The P(s)L cannot debug
- programs over the telephone.
-
- Disks in the P(s)L are updated monthly, so if you did not get
- this disk directly from the P(s)L, you should be aware that
- the files in this set may no longer be the current versions.
-
- For a copy of the latest monthly software library newsletter
- and a list of the 2,000+ disks in the library, call or write
-
- The Public (Software) Library
- P.O.Box 35705
- Houston, TX 77235-5705
- (713) 524-6394
-